This paper presents a novel method to estimate characteristics of information sources about a topic by analyzing their information diffusion subnetworks in blogspace. In an information diffusion network, each influential information source has an affected subnetwork whose nodes are reachable from it. We define three information diffusion properties of the subnetwork using the numbers of three types...
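The notion of an affected subnetwork, i.e. the nodes reachable from an information source, can be illustrated with a breadth-first traversal. This is only a minimal sketch: the edge-list representation, the function name `affected_subnetwork`, and the toy blog graph are assumptions for illustration and are not taken from the paper.

```python
from collections import deque


def affected_subnetwork(edges, source):
    """Return the set of nodes reachable from `source`, excluding `source` itself.

    `edges` is a list of directed (from_node, to_node) pairs; this
    representation is an assumption made for the sketch.
    """
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

    seen.discard(source)
    return seen


# Hypothetical diffusion network: an edge (x, y) means information flowed from x to y.
edges = [("a", "b"), ("b", "c"), ("d", "c"), ("c", "e")]
print(sorted(affected_subnetwork(edges, "a")))
```

Properties of a source could then be computed from the returned node set, for example its size.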
The abundance of content on the web and the lack of quality control require more refined approaches in analyzing online information. In this paper, we propose evaluating the extent to which web search results cover important and recent news related to real-world objects. Our method allows for identifying search results that provide comprehensive overviews of major events related to user queries or...
Biomedical entity extraction from unstructured web documents is an important task that needs to be performed in order to discover knowledge in the veterinary medicine domain. In general, this task can be approached by applying domain specific ontologies, but a review of the literature shows that there is no universal dictionary, or ontology for this domain. To address this issue, we manually construct...
A large share of current sentiment analysis techniques is based on special-purpose lexicons that provide information about the semantic orientation (e.g. positive, negative, neutral) of their entries. Due to the high labor costs of manually assembling such resources, recent work has focused on automatically inducing the polarity of given terms. We follow this line of work while focusing on the domain of...
Reuse is an important mechanism for improving the efficiency of software development. For Internet-scale software produced through service composition, reuse at the granularity of individual services is often inefficient due to the large number of available services. This paper proposes a novel architecture which enables efficient reuse of process fragments. In the proposed architecture, services are organized...
This paper intends to present a straightforward, extensive, and noise resistant method for efficiently tagging a web query, submitted to a search engine, with proper category labels. These labels are intended to represent the closest categories related to the query which can ultimately be used to enhance the results of any typical search engine by either restricting the results to matching categories...
A challenge for personalised recommender systems is to target products in the long tail. That is, to recommend products that the end-user likes, but that are not generally popular. To achieve this goal, in this paper we propose two strategies to identify relevant but niche products. The first strategy computes an inverse item popularity and applies it during the steps of top-N recommendation. Given...
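The first strategy, weighting by inverse item popularity, can be sketched as a re-ranking step. The abstract does not give the exact formula, so the IDF-style weight `log(n_users / popularity) + 1` and the function names below are assumptions chosen purely for illustration.

```python
import math


def inverse_popularity(interactions, n_users):
    """Map each item to an IDF-style weight (an assumed formula, not the paper's)."""
    users_per_item = {}
    for user, item in interactions:
        users_per_item.setdefault(item, set()).add(user)
    return {
        item: math.log(n_users / len(users)) + 1.0
        for item, users in users_per_item.items()
    }


def rerank_top_n(scores, weights, n):
    """Multiply relevance scores by the popularity weights and keep the top n items."""
    adjusted = {item: score * weights.get(item, 1.0) for item, score in scores.items()}
    # Sort by descending adjusted score, breaking ties by item name.
    return sorted(adjusted, key=lambda item: (-adjusted[item], item))[:n]


# Hypothetical data: "hit" was consumed by all four users, "niche" by only one.
interactions = [("u1", "hit"), ("u2", "hit"), ("u3", "hit"), ("u4", "hit"), ("u1", "niche")]
weights = inverse_popularity(interactions, n_users=4)
scores = {"hit": 0.9, "niche": 0.5}  # assumed relevance scores for some target user
print(rerank_top_n(scores, weights, n=1))
```

In this toy example the niche item overtakes the popular one after re-weighting, which is the intended long-tail effect.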
Keyword-based search engines often return an unexpected number of results. Zero hits are naturally undesirable, while too many hits are likely to be overwhelming and of low precision. We present an approach for predicting the number of hits for a given set of query terms. Using word frequencies derived from a large corpus, we construct random samples of combinations of these words as search terms...
We propose using multi-layer multiple instance learning (MMIL) for image set classification and applying it to the task of cannabis website classification. We treat each image as an instance in an image set, then each image is further viewed as containing instances of local image patches. This representation naturally extends traditional multiple instance learning (MIL) to multi-layers. We then show...
In a dynamic environment whose behavior may change over time, agents face difficulties when trying to solve problems related to that environment. Changes in an environment, e.g. a market, can be quite drastic: from changing the dependencies of some products to adding new actions to build new products. The agents should try to cooperate or compete against...
Computational trust propagation is an important method for establishing trust in strangers. In ad-hoc or P2P networks, such an approach allows a node to choose trusted nodes for routing, data storage, or computation, even if the choosing node has had no previous experience with the nodes under consideration. Human trust propagation can occur through a variety of phenomena, such as recommendation of trusted...
The indexing vocabulary is an important determinant of success in text retrieval. Researchers have compared the effectiveness of indexing strategies using free-text and controlled vocabularies in a variety of text contexts. This paper introduces a new approach to creating indices for medical literature using the Semantic Graph (SG) derived from both UMLS Metathesaurus and WordNet. Our performance...
Many real-world phenomena can be naturally modeled as graph structures whose nodes represent entities and whose edges represent interactions or relationships between entities. The analysis of graph data has many practical implications. However, the release of the data often poses considerable privacy risks to the individuals involved. In this paper, we address the edge privacy problem in...
In this paper, a new approach to decision making inspired by the human visual cortex is proposed. In this approach, the knowledge of a group of agents (training data) is used for decision making. The proposed approach tries to meet two fundamental requirements, i.e., robustness and specificity. The hierarchical model presented in this work tries to extract the knowledge about...
In this paper, we propose a Relation Expansion framework, which uses a few seed sentences marked up with two entities to expand a set of sentences containing target relations. During the expansion process, label propagation algorithm is used to select the most confident entity pairs and context patterns. The label propagation algorithm is a graph based semi-supervised learning method which models...
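Label propagation as a graph-based semi-supervised method can be sketched as follows. This is a generic, simplified variant with clamped seeds and neighbour averaging; the graph layout, the function name `propagate_labels`, and the neutral starting score of 0.5 are assumptions and not the formulation used in the paper.

```python
def propagate_labels(weights, seeds, iterations=50):
    """Spread seed scores over a weighted graph by repeated neighbour averaging.

    weights: dict mapping node -> {neighbour: edge weight} (assumed symmetric).
    seeds:   dict mapping labeled node -> score in [0, 1]; seeds stay clamped.
    """
    nodes = set(weights)
    for neighbours in weights.values():
        nodes.update(neighbours)

    # Unlabeled nodes start from a neutral score (an assumption of this sketch).
    scores = {node: seeds.get(node, 0.5) for node in nodes}

    for _ in range(iterations):
        updated = {}
        for node in nodes:
            if node in seeds:
                updated[node] = seeds[node]  # labeled nodes never change
                continue
            neighbours = weights.get(node, {})
            total = sum(neighbours.values())
            if total == 0:
                updated[node] = scores[node]  # isolated node keeps its score
            else:
                updated[node] = sum(w * scores[n] for n, w in neighbours.items()) / total
        scores = updated
    return scores


# Hypothetical chain graph: pos - a - b - neg, all edge weights equal to 1.
graph = {
    "pos": {"a": 1.0},
    "a": {"pos": 1.0, "b": 1.0},
    "b": {"a": 1.0, "neg": 1.0},
    "neg": {"b": 1.0},
}
result = propagate_labels(graph, seeds={"pos": 1.0, "neg": 0.0})
print(round(result["a"], 3), round(result["b"], 3))
```

Nodes closer to the positive seed end up with higher scores, so the most confident candidates could be selected by thresholding or ranking these scores.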
Navigation of a student over learning objects can be solved by using the concept space of a domain, which contains information about structural and dependency relations between concepts. Every learning object has defined relations to several concepts. Relations which exist in the learning object space also exist in the concept space between the corresponding concepts. We present an evaluation of controlled...
This article presents a comparison of overlapping clustering methods for data mining of DBLP datasets. For the analysis, the DBLP datasets were pre-processed, and each journal was assigned attributes defined by its topics. The data collection can be described as vague and uncertain; obtained clusters and applied queries do not necessarily have crisp boundaries. The authors presented...
Clustering, an important data mining task that groups data on the basis of similarities among the data, can be divided into two broad categories: partitional and hierarchical clustering. We combine these two methods and propose a novel clustering algorithm called Hierarchical Particle Swarm Optimization (HPSO) data clustering. The proposed algorithm exploits the swarm intelligence of cooperating...
The Distributed Stochastic Algorithm (DSA), Distributed Breakout Algorithm (DBA), and variations such as Distributed Simulated Annealing (DSAN), MGM-1, and DisPeL, are distributed hill-climbing techniques for solving large Distributed Constraint Optimization Problems (DCOPs) such as distributed scheduling, resource allocation, and distributed route planning. Like their centralized counterparts, these...